A novel algorithm for rapid speaker adaptation based on structural maximum likelihood eigenspace mapping

Authors

  • Bowen Zhou
  • John H. L. Hansen
Abstract

In this paper, we propose a novel algorithm for rapid speaker adaptation based on our Structural Maximum Likelihood Eigenspace Mapping (SMLEM). The proposed method constructs a binary-tree structured hierarchical Speaker Independent (SI) eigenspace at different levels from well-trained SI system models, and then dynamically constructs a new set of Speaker Dependent (SD) eigenspaces at the corresponding levels, according to the availability of incoming adaptation data. By mapping the mixture Gaussian components from an SI eigenspace to the SD eigenspaces in a maximum likelihood manner, the SI models are adapted towards SD models (the EM algorithm is used to derive the eigenspace bias). Compared with conventional MLLR, the proposed algorithm is both computationally cheaper and more effective when only a very small amount (from 5 to 15 seconds) of adaptation data is available. In our simulations using the DARPA WSJ Spoke3 corpus, an average relative reduction in WER of 10.5% was achieved over MLLR adaptation when using 5 seconds of data for adaptation.
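
The abstract describes two ideas that a small example can make concrete: a binary tree whose level is chosen according to how much adaptation data has arrived, and a bias estimated in a maximum likelihood manner. The Python sketch below illustrates only those two ideas and is not the SMLEM algorithm itself. It omits the eigenspace construction entirely, works in the original feature space, assumes diagonal covariances and fixed component posteriors (a single EM M-step), and uses hypothetical names throughout (Node, accumulate, select_node, ml_bias, min_count), none of which come from the paper.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One node of a binary tree over Gaussian components (hypothetical layout)."""
    name: str
    parent: Optional["Node"] = None
    count: float = 0.0                               # soft frame count at this node
    num: List[float] = field(default_factory=list)   # sum of gamma * (o - mu) / var
    den: List[float] = field(default_factory=list)   # sum of gamma / var


def accumulate(leaf: Node, gamma: float, obs: List[float],
               mean: List[float], var: List[float]) -> None:
    """Add one frame's statistics to a leaf and to every ancestor up to the root."""
    node: Optional[Node] = leaf
    while node is not None:
        if not node.num:
            node.num = [0.0] * len(obs)
            node.den = [0.0] * len(obs)
        node.count += gamma
        for d in range(len(obs)):
            node.num[d] += gamma * (obs[d] - mean[d]) / var[d]
            node.den[d] += gamma / var[d]
        node = node.parent


def select_node(leaf: Node, min_count: float) -> Optional[Node]:
    """Return the deepest node on the leaf-to-root path with enough data, else None."""
    node: Optional[Node] = leaf
    while node is not None and node.count < min_count:
        node = node.parent
    return node


def ml_bias(node: Node) -> List[float]:
    """Shared mean bias maximising the EM auxiliary function for diagonal covariances."""
    return [n / d for n, d in zip(node.num, node.den)]


# Toy usage: two leaves under one root, unit variances, zero SI means.
root = Node("root")
left = Node("left", parent=root)
right = Node("right", parent=root)
accumulate(left, 1.0, [1.0, 2.0], [0.0, 0.0], [1.0, 1.0])
accumulate(left, 1.0, [3.0, 2.0], [0.0, 0.0], [1.0, 1.0])
accumulate(right, 1.0, [0.5, 0.5], [0.0, 0.0], [1.0, 1.0])

left_node = select_node(left, min_count=2.0)    # the leaf itself has enough data
right_node = select_node(right, min_count=2.0)  # backs off to the root
left_bias = ml_bias(left_node)                  # [2.0, 2.0]
right_bias = ml_bias(right_node)                # [1.5, 1.5]
```

With more adaptation data, more leaves pass the threshold and receive their own bias; with very little data, every component falls back to a bias shared near the root. This is the same kind of back-off used with regression-class trees in MLLR, shown here under the simplifying assumptions stated above.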


Similar resources

Improved Structural Maximum Mapping for Rapid Speak

In this paper, we expand on a previously proposed algorithm entitled Structural Maximum Likelihood Eigenspace Mapping (SMLEM) [5, 6] for rapid speaker adaptation by exploring a variety of model clustering methods and incorporating a multi-stream approach. The SMLEM algorithm directly adapts speaker independent acoustic models to a test speaker by mapping the mixture Gaussian components from a s...


Fast speaker adaptation using eigenspace-based maximum likelihood linear regression

This paper presents an eigenspace-based fast speaker adaptation approach which can improve the modeling accuracy of the conventional maximum likelihood linear regression (MLLR) techniques when only very limited adaptation data is available. The proposed eigenspace-based MLLR approach was developed by introducing a priori knowledge analysis on the training speakers via PCA, so as to construct an...


Eigenspace-based speaker adaptation methods in Persian speech recognition systems

Among speaker adaptation algorithms, eigenvoice (EV) and eigenspace-based MLLR (EMLLR) adaptation approaches have been proposed for rapid adaptation with very limited adaptation data. In these methods, a speaker adapted model is constrained to be a weighted combination of some orthogonal basis vectors. In this manner, both the number of parameters to be estimated from the adaptation data, and t...


Improving robustness of MLLR adaptation with speaker-clustered regression class trees

We introduce a strategy for modeling speaker variability in speaker adaptation based on maximum likelihood linear regression (MLLR). The approach uses a speaker clustering procedure that models speaker variability by partitioning a large corpus of speakers in the eigenspace of their MLLR transformations and learning cluster-specific regression class tree structures. We present experiments showin...


Speaker clustered regression-class trees for MLLR adaptation

A speaker clustering algorithm is presented that is based on an eigenspace representation of Maximum Likelihood Linear Regression (MLLR) transformations and is used for training cluster-dependent regression-class trees for MLLR adaptation. It is shown that significant automatic speech recognition (ASR) system performance gains are possible by choosing the best regression-class tree structure fo...




Publication date: 2001